New Instability Results for High Dimensional Nearest Neighbor Search
Author
Abstract
Consider a dataset of $n(d)$ points generated independently from $\mathbb{R}^d$ according to a common p.d.f. $f_d$ with $\mathrm{support}(f_d) = [0,1]^d$ and $\sup\{f_d([0,1]^d)\}$ growing sub-exponentially in $d$. We prove that: (i) if $n(d)$ grows sub-exponentially in $d$, then, for any query point $\vec{q} \in [0,1]^d$ and any $\epsilon > 0$, the ratio of the distance between any two dataset points and $\vec{q}$ is less than $1 + \epsilon$ with probability $\to 1$ as $d \to \infty$; (ii) if $n(d) > [4(1+\epsilon)^2]^d$ for large $d$, then for all $\vec{q} \in [0,1]^d$ (except a small subset) and any $\epsilon > 0$, the distance ratio is less than $1 + \epsilon$ with limiting probability strictly bounded away from one. Moreover, we provide preliminary results along the lines of (i) when $f_d = N(\vec{\mu}_d, \Sigma_d)$.
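The concentration effect in result (i) can be observed numerically. The following is a minimal sketch, not the paper's method: it assumes the uniform density on the unit cube (one density that meets the stated hypotheses), Euclidean distance, a randomly drawn query, and a fixed dataset size. The names `distance_ratio` and `demo` are hypothetical. It reports the farthest-to-nearest distance ratio, which is the worst case over all pairs of dataset points.

```python
import math
import random


def distance_ratio(n, d, rng):
    """Return (farthest distance) / (nearest distance) from a random query
    to n points drawn uniformly from the unit cube [0, 1]^d."""
    query = [rng.random() for _ in range(d)]
    dists = []
    for _ in range(n):
        point = [rng.random() for _ in range(d)]
        dists.append(math.dist(point, query))  # Euclidean distance (Python 3.8+)
    return max(dists) / min(dists)


def demo(dims=(2, 10, 100, 1000), n=200, seed=0):
    """Compute the ratio for several dimensions with a fixed dataset size."""
    rng = random.Random(seed)
    return {d: distance_ratio(n, d, rng) for d in dims}


if __name__ == "__main__":
    for d, ratio in demo().items():
        print(f"d={d:5d}  max/min distance ratio = {ratio:.3f}")
```

With the dataset size held fixed, the printed ratio should drift toward 1 as the dimension grows, which is the instability described in (i). Result (ii) concerns dataset sizes that grow exponentially in the dimension, a regime too large to simulate directly this way.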
Similar resources
Fast Nearest Neighbor Search in High-Dimensional Space
Similarity search in multimedia databases requires an efficient support of nearest-neighbor search on a large set of high-dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest-neighbor search are not efficient in higher dimensions. In our new approach, we therefore precompute the result of any nearest-neighbor se...
What Is the Nearest Neighbor in High Dimensional Spaces?
Nearest neighbor search in high dimensional spaces is an interesting and important problem which is relevant for a wide variety of novel database applications. As recent results show, however, the problem is a very difficult one, not only with regards to the performance issue but also to the quality issue. In this paper, we discuss the quality issue and identify a new generalized notion of neares...
Fast Nearest-Neighbor Search Algorithms Based on High-Multidimensional Data
Similarity search in multimedia databases requires an efficient support of nearest-neighbor search on a large set of high-dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest-neighbor search are not efficient in higher dimensions. In our new approach, we therefore pre-compute the result of any nearest-neighbor s...
Indexing the Solution Space: A New Technique for Nearest Neighbor Search in High-Dimensional Space
Similarity search in multimedia databases requires an efficient support of nearest-neighbor search on a large set of high-dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest-neighbor search are not efficient in higher dimensions. In our new approach, we therefore precompute the result of any nearest-neighbor se...
On Optimizing Nearest Neighbor Queries in High-Dimensional Spaces
Nearest-neighbor queries in high-dimensional space are of high importance in various applications, especially in content-based indexing of multimedia data. For an optimization of the query processing, accurate models for estimating the query processing costs are needed. In this paper, we propose a new cost model for nearest neighbor queries in high-dimensional space, which we apply to enhance t...
On Optimizing Nearest Neighbor Queries in High-Dimensional Data Spaces
Nearest-neighbor queries in high-dimensional space are of high importance in various applications, especially in content-based indexing of multimedia data. For an optimization of the query processing, accurate models for estimating the query processing costs are needed. In this paper, we propose a new cost model for nearest neighbor queries in high-dimensional space, which we apply to enhance t...
Journal: Inf. Process. Lett.
Volume: 109
Publication date: 2009